A novel β1→6 N-acetylglucosaminyltransferase, its acceptor molecule, leukosialin, and a method
for cloning proteins having enzymatic activity.

The present invention provides a novel β1→6 N-acetylglucosaminyltransferase, which forms core 2
oligosaccharide structures in O-glycans, and a novel acceptor molecule, leukosialin, CD43, for core 2
β1→6 N-acetylglucosaminyltransferase activity. The amino acid sequences and nucleic acid sequences
encoding these molecules, as well as active fragments thereof, also are disclosed. A method for
isolating nucleic acid sequences encoding proteins having enzymatic activity is disclosed, using CHO
cells that support replication of plasmid vectors having a polyoma virus origin of replication. A
method to obtain a suitable cell line that expresses an acceptor molecule also is disclosed.

This work was supported by grants CA33000 and CA33895 awarded by the National Cancer Institute.
The United States Government has certain rights in this invention.

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION 

This invention relates generally to the fields of biochemistry and molecular biology and more
specifically to a novel human enzyme, UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc)
N-acetylglucosaminyltransferase (core 2 β1→6 N-acetylglucosaminyltransferase; C2GnT), and to a novel
acceptor molecule, leukosialin, CD43, for core 2 β1→6 N-acetylglucosaminyltransferase action. The
invention additionally relates to DNA sequences encoding core 2 β1→6 N-acetylglucosaminyltransferase
and leukosialin, to vectors containing a C2GnT DNA sequence or a leukosialin DNA sequence, to
recombinant host cells transformed with such vectors and to a method of transient expression cloning
in CHO cells for identifying and isolating DNA sequences encoding specific proteins, using CHO cells
expressing a suitable acceptor molecule.

BACKGROUND INFORMATION

Most O-glycosidic oligosaccharides in mammalian glycoproteins are linked via N-acetylgalactosamine to
the hydroxyl groups of serine or threonine. These O-glycans can be classified into 4 different groups
depending on the nature of the core portion of the oligosaccharides (see Fig. 1). Although less well
studied than N-glycans, O-glycans likely have important biological functions. Indeed, the presence of
O-linked oligosaccharides with the core 2 branch, Galβ1→3(GlcNAcβ1→6)GalNAc, has been demonstrated
in many biological processes.

Piller et al., J. Biol. Chem. 263:15146-15150 (1988) reported that human T-cell activation is
associated with the conversion of core 1-based tetrasaccharides to core 2-based hexasaccharides on
leukosialin, a major sialoglycoprotein present on human T lymphocytes (see also Fig. 1). A similar
increase in hexasaccharides was observed in peripheral blood lymphocytes of patients suffering from
T-cell leukemias (Saitoh et al., Blood 77:1491-1499 (1991)), myelogenous leukemias (Brockhausen et
al., Cancer Res. 51:1257-1263 (1991)) and immunodeficiency due to AIDS and the Wiskott-Aldrich
syndrome (Piller et al., J. Exp. Med. 173:1501-1510 (1991)). In these patients' lymphocytes, changes
in the amount of hexasaccharides were caused by increased activity of either
UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc) 6-β-D-N-acetylglucosaminyltransferase (EC 2.4.1.102) or
core 2 β1→6 N-acetylglucosaminyltransferase (Williams et al., J. Biol. Chem. 255:11253-11261
(1980)). Increased activity of core 2 β1→6 N-acetylglucosaminyltransferase also was observed in
metastatic murine tumor cell lines as compared to their parental, non-metastatic counterparts
(Yousefi et al., J. Biol. Chem. 266:1772-1782 (1991)).

Increased complexity of the attached oligosaccharides increases the molecular weight of the
glycoprotein. For example, leukosialin containing hexasaccharides has a molecular weight of ~135 kDa,
whereas leukosialin containing tetrasaccharides has a molecular weight of ~105 kDa (Carlsson et al.,
J. Biol. Chem. 261:12779-12786 and 12787-12795 (1986)).

Fox et al., J. Immunol. 131:762-767 (1983) raised a monoclonal antibody, T305, against human
T-lymphocytic leukemia cells. Sportsman et al., J. Immunol. 135:158-164 (1985) reported T305 binding
was abolished by neuraminidase treatment, suggesting T305 binds to hexasaccharides. T305 specifically
reacts with the high molecular weight form of leukosialin (Saitoh et al., supra, (1991)).

Previous studies indicated poly-N-acetyllactosamine repeats extend almost exclusively from the branch
formed by the core 2 β1→6 N-acetylglucosaminyltransferase (Fukuda et al., J. Biol. Chem.
261:12796-12806 (1986)). Consistent with these results, Yousefi et al., supra, (1991) demonstrated
that the core 2 enzyme in metastatic tumor cells regulates the level of poly-N-acetyllactosamine
synthesis in O-linked oligosaccharides.

Poly-N-acetyllactosamines are subject to a variety of modifications, including the formation of the
sialyl Leˣ, NeuNAcα2→3Galβ1→4(Fucα1→3)GlcNAc-, or the sialyl Leᵃ, NeuNAcα2→3Galβ1→3(Fucα1→4)GlcNAc-,
determinants (Fukuda, Biochim. Biophys. Acta 780:119-150 (1985)). Such modifications are significant
because these determinants, which are present on neutrophils and monocytes, serve as ligands for E-
and P-selectin present on endothelial cells and platelets, respectively (see, for example, Larsen et
al., Cell 63:467-474 (1990)).

In addition, tumor cells often express a significant amount of sialyl Leˣ and/or sialyl Leᵃ on their
cell surfaces. The interaction between E-selectin or P-selectin and these cell surface carbohydrates
may play a role in tumor cell adhesion to endothelium during the metastatic process (Walz et al.,
supra, (1990)). Kojima et al., Biochem. Biophys. Res. Commun. 182:1288-1295 (1992) reported that
selectin-dependent tumor cell adhesion to endothelial cells was abolished by blocking O-glycan
synthesis. Complex sulfated O-glycans also may serve as ligands for the lymphocyte homing receptor,
L-selectin (Imai et al., J. Cell Biol. 113:1213-1221 (1991)).

These reported observations establish core 2 β1→6 N-acetylglucosaminyltransferase as a critical
enzyme in O-glycan biosynthesis. The availability of core 2 β1→6 N-acetylglucosaminyltransferase will
allow the in vivo and in vitro production of specific glycoproteins having core 2 oligosaccharides
and subsequent study of these variant O-glycans on cell-cell interactions. For example, core 2 β1→6
N-acetylglucosaminyltransferase is a useful marker for transformed or cancerous cells. An
understanding of the role of core 2 β1→6 N-acetylglucosaminyltransferase in transformed and cancerous
cells may elucidate a mechanism for the aberrant cell-cell interactions observed in these cells. In
order to understand the control of expression of these oligosaccharides and their function, isolation
of a cDNA clone for core 2 β1→6 N-acetylglucosaminyltransferase is a prerequisite. However, the cDNA
sequence encoding core 2 β1→6 N-acetylglucosaminyltransferase has not yet been reported.

Thus, a need exists for identifying the core 2 β1→6 N-acetylglucosaminyltransferase and the DNA
sequences encoding this enzyme. The present invention satisfies this need and provides related
advantages as well.

SUMMARY OF THE INVENTION

The present invention generally relates to a novel purified human β1→6
N-acetylglucosaminyltransferase. A cDNA sequence encoding a 428 amino acid protein having β1→6
N-acetylglucosaminyltransferase activity also is provided. The purified human β1→6
N-acetylglucosaminyltransferase, or an active fragment thereof, catalyzes the formation of critical
branches in O-glycans.

The invention further relates to a novel purified acceptor molecule, leukosialin, CD43, for core 2
β1→6 N-acetylglucosaminyltransferase activity. The leukosialin cDNA encodes a novel variant
leukosialin, which is created by alternative splicing of the genomic leukosialin DNA sequence.

Isolated nucleic acids encoding either core 2 β1→6 N-acetylglucosaminyltransferase or leukosialin are
disclosed, as are vectors containing the nucleic acids and recombinant host cells transformed with
such vectors. The invention further provides methods of detecting such nucleic acids by contacting a
sample with a nucleic acid probe having a nucleotide sequence capable of hybridizing with the
isolated nucleic acids of the present invention. The core 2 β1→6 N-acetylglucosaminyltransferase and
leukosialin amino acid and nucleic acid sequences disclosed herein can be purified from human cells
or produced using well known methods of recombinant DNA technology.

The invention also discloses a method of isolating nucleic acid sequences encoding proteins that have
an enzymatic activity. Such a nucleic acid sequence is obtained by transfecting the nucleic acid,
which is contained within a vector having a polyoma virus replication origin, into a Chinese hamster
ovary (CHO) cell line simultaneously expressing polyoma virus large T antigen and the acceptor
molecule for the protein having an enzymatic activity.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structures and biosynthesis of O-glycans. Structures of O-glycan cores can be
classified into 4 groups (core 1 to core 4), each of which is synthesized starting with
GalNAcα1→Ser/Thr. The core 1 structure is synthesized by the addition of a β1→3 Gal residue to the
GalNAc residue. The core 1 structure can be converted to core 2 by the addition of a β1→6
N-acetylglucosaminyl residue. This intermediate is usually converted to the hexasaccharide by
sequential addition of galactose and sialic acid residues (bottom right). The core 2 β1→6
N-acetylglucosaminyltransferase and the linkage formed by the enzyme are indicated by a box. In
certain cell types, the core 2 structure can be extended by the addition of N-acetyllactosamine
(Galβ1→4GlcNAcβ1→3) repeats to form poly-N-acetyllactosamine. In the absence of core 2 β1→6
N-acetylglucosaminyltransferase, core 1 is converted to the monosialoform, then to the disialoform by
sequential addition of α2→3- and α2→6-linked sialic acid residues (bottom left). Alternatively, core
3 can be synthesized by the addition of a β1→3 N-acetylglucosaminyl residue to the GalNAc residue.
Core 3 can be converted to core 4 by another β1→6 N-acetylglucosaminyltransferase (top of figure).

Figure 2 depicts genomic DNA sequence (SEQ. ID. NO. 1) and cDNA sequence (SEQ. ID. NO. 1) of
leukosialin. The genomic sequence is numbered relative to the transcriptional start site. Exon 1 and
exon 2 have been previously described. Exon 1' is newly identified here. In the isolated cDNA, exon
1' is immediately followed by the exon 2 sequence. Deduced amino acids (SEQ. ID. NO. 2) are presented
under the coding sequence, which begins in exon 2. A portion of the exon 2 sequence is shown.

Figure 3 establishes the ability of pGT/hCG to replicate in CHO cell lines expressing polyoma large T
antigen and leukosialin. In panel A, six clonal CHO cell lines were examined for replication of
pcDNAI-based pGT/hCG (lanes 1-6). In panel B, replication of cell clone 5 (CHO-Py-leu) was further
examined by treatment with increasing concentrations of DpnI and XhoI (lanes 2 and 3). Plasmid DNA
isolated from MOP-8 cells was used as a control (lane 1). Plasmid DNA was extracted using the Hirt
procedure and samples were digested with XhoI and DpnI. In parallel, pGT/hCG plasmid purified from
E. coli MC1061/P3 was digested with XhoI and DpnI (lane 7 in panel A and lane 4 in panel B) or XhoI
alone (lane 8 in panel A and lane 5 in panel B). The arrow indicates the migration of plasmid DNA
resistant to DpnI digestion. The arrowheads indicate plasmid DNA digested by DpnI.

Figure 4 shows the expression of T305 antigen directed by pcDNAI-C2GnT. Subconfluent CHO-Py-leu cells
were transfected with pcDNAI-C2GnT (panels A and B) or mock-transfected with pcDNAI (panels C and D).
Sixty-four hours after transfection, the cells were fixed, then incubated with mouse T305 monoclonal
antibody followed by fluorescein isothiocyanate-conjugated sheep anti-mouse IgG (panels A, B and C).
Two different areas are shown in panels A and B. Panel D shows a phase micrograph of the same field
shown in panel C. Bar = 20 μm.

Figure 5 depicts the cDNA sequence (SEQ. ID. NO. 3) and translated amino acid sequence (SEQ. ID.
NO. 4) of core 2 β1→6 N-acetylglucosaminyltransferase. The open reading frame and full-length
nucleotide sequence of C2GnT are shown. The signal/membrane-anchoring domain is doubly underlined.
The polyadenylation signal is boxed. Potential N-glycosylation sites are marked with asterisks. The
sequences are numbered relative to the translation start site.

Figure 6 shows the expression of core 2 β1→6 N-acetylglucosaminyltransferase mRNA in various cell
types. Poly(A)+ RNA (11 μg) from CHO-Py-leu cells (lane 1), HL-60 promyelocytes (lane 2), K562
erythrocytic cells (lane 3), and SP and L4 colonic carcinoma cells (lanes 4 and 5) was resolved by
electrophoresis. RNA was transferred to a nylon membrane and hybridized with a radiolabeled fragment
of pPROTA-C2GnT. Migration of RNA size markers is indicated.

Figure 7 illustrates the construction of the vector encoding the protein A-C2GnT fusion protein. The
cDNA sequence corresponding to Pro38 to His428 was fused in frame with the IgG binding domain of
S. aureus protein A (bottom; SEQ. ID. NOS. 7 and 8). The sequence includes the cleavable signal
peptide, which allows secretion of the fused protein. The coding sequence is under control of the
SV40 promoter. The remainder of the vector sequence shown was derived from rabbit β-globin gene
sequences, including an intervening sequence (IVS) and a polyadenylation signal (An).

DETAILED DESCRIPTION OF THE INVENTION

The present invention generally relates to a novel human core 2 β1→6 N-acetylglucosaminyltransferase.
The invention further relates to a novel method of transient expression cloning in CHO cells that was
used to isolate the cDNA sequence encoding human core 2 β1→6 N-acetylglucosaminyltransferase (C2GnT).
The invention also relates to a novel human leukosialin, which is an acceptor molecule for core 2
β1→6 N-acetylglucosaminyltransferase activity.

Cells generally contain extremely low amounts of glycosyltransferases. As a result, cDNA cloning
based on screening using an antibody or a probe based on the glycosyltransferase amino acid sequence
has met with limited success. However, isolation of cDNAs encoding various glycosyltransferases can
be achieved by transient expression of cDNA in recipient cells.

Successful application of the transient expression cloning method to isolate a cDNA sequence encoding
a glycosyltransferase requires an appropriate recipient cell line. Ideal recipient cells should not
express the glycosyltransferase of interest. As a result, the recipient cells would normally lack the
oligosaccharide structure formed by such a glycosyltransferase.

Expression of the cloned glycosyltransferase cDNA in the recipient cell line should result in
formation of the specific oligosaccharide structure. The resultant oligosaccharide can be identified
using a specific antibody or lectin that recognizes the structure. The recipient cell line also must
support replication of an appropriate plasmid vector.
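
Whether a recipient cell line supports plasmid replication is commonly judged by DpnI resistance, the
readout used in the assay of Figure 3. The following Python sketch is illustrative only and is not
part of the disclosed method. It assumes a linearized plasmid, models DpnI as cutting GATC sites only
when they carry bacterial Dam methylation, and uses a made-up 16-base sequence; the function name
dpni_fragment_lengths is hypothetical.

```python
import re


def dpni_fragment_lengths(linear_dna: str, dam_methylated: bool) -> list:
    """Fragment lengths after DpnI digestion of a linear DNA molecule.

    Simplifying assumption: every GATC site is either fully methylated
    (plasmid prepared from dam+ E. coli) or fully unmethylated (plasmid
    that has replicated in mammalian cells).
    """
    dna = linear_dna.upper()
    if not dam_methylated:
        # DpnI does not cut unmethylated GATC, so the molecule stays intact.
        return [len(dna)]
    # DpnI cuts between A and T (GA^TC), i.e. two bases into each site.
    cuts = [m.start() + 2 for m in re.finditer("GATC", dna)]
    edges = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(edges, edges[1:])]


toy = "AAGATCTTTTGATCCC"  # hypothetical sequence containing two GATC sites
print(dpni_fragment_lengths(toy, dam_methylated=True))   # input plasmid: digested
print(dpni_fragment_lengths(toy, dam_methylated=False))  # replicated plasmid: intact
```

In this simplified model, a single full-length band after digestion corresponds to the DpnI-resistant
species, and the smaller fragments correspond to unreplicated input plasmid.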

COS-1 cells initially appear to satisfy the requirements for using the transient expression method.
COS-1 cells express SV40 large T antigen and support the replication of plasmid vectors harboring a
SV40 replication origin (Gluzman et al., Cell 23:175-182 (1981)). Although COS-1 cells, themselves,
express a variety of glycosyltransferases, COS-1 cells have been used to clone cDNA sequences
encoding human blood group Lewis α1→3/4 fucosyltransferase and murine α1→3 galactosyltransferase
(Kukowska-Latallo et al., Genes and Devel. 4:1288-1303 (1990); Larsen et al., Proc. Natl. Acad. Sci.
USA 86:8227-8231 (1989)). Also, Goelz et al., Cell 63:175-182 (1990), utilized an antibody that
inhibits E-selectin mediated adhesion to isolate a cDNA sequence encoding α1→3 fucosyltransferase.

An attempt was made to use COS-1 cells to isolate cDNA clones encoding core 2 β1→6
N-acetylglucosaminyltransferase. COS-1 cells were transfected using cDNA obtained from activated
human T cells, which express the core 2 β1→6 N-acetylglucosaminyltransferase. Transfected cells
suspected of expressing core 2 β1→6 N-acetylglucosaminyltransferase were identified by the presence
of increased levels of the core 2 oligosaccharide structure formed by core 2 β1→6
N-acetylglucosaminyltransferase activity. The presence of the core 2 structure was identified using
the monoclonal antibody, T305, which identifies a hexasaccharide on leukosialin. A clone expressing
high levels of the T305 antigen was isolated and sequenced.

Surprisingly, transfection using COS-1 cells resulted in the isolation of a cDNA clone encoding a
novel variant of human leukosialin, which is the acceptor molecule for core 2 β1→6
N-acetylglucosaminyltransferase activity. Examination of the cDNA sequence of the newly isolated
leukosialin revealed the cDNA sequence was formed as a result of alternative splicing of exons in the
genomic leukosialin DNA sequence. Specifically, the newly isolated leukosialin is encoded by a cDNA
sequence containing a previously undescribed non-coding exon at the 5'-terminus (exon 1' in Figure 2;
SEQ. ID. NO. 1).

The unexpected result obtained using COS-1 cells led to the development of a new transfection system
to isolate a cDNA sequence encoding core 2 β1→6 N-acetylglucosaminyltransferase. CHO cells, which do
not normally express the T305 antigen, were transfected with DNA sequences encoding human leukosialin
and the polyoma virus large T antigen. A cell line, designated CHO-Py-leu, which expresses human
leukosialin and polyoma virus large T antigen, was isolated.

CHO-Py-leu cells were used for transient expression cloning of a cDNA sequence encoding core 2 β1→6
N-acetylglucosaminyltransferase. CHO-Py-leu cells were transfected with cDNA obtained from human
HL-60 promyelocytes. A plasmid, pcDNAI-C2GnT, which directed expression of the T305 antigen, was
isolated and the cDNA insert was sequenced (see Figure 5; SEQ. ID. NO. 3). The 2105 base pair cDNA
sequence encodes a putative 428 amino acid protein (SEQ. ID. NO. 4). The genomic DNA sequence
encoding C2GnT can be isolated using methods well known to those skilled in the art, such as nucleic
acid hybridization using the core 2 β1→6 N-acetylglucosaminyltransferase cDNA disclosed herein to
screen, for example, a genomic library prepared from HL-60 promyelocytes.

An enzyme similar to the disclosed human core 2 β1→6 N-acetylglucosaminyltransferase has been
purified from bovine tracheal epithelium (Ropp et al., J. Biol. Chem. 266:23863-23871 (1991), which
is incorporated herein by reference). The apparent molecular weight of the bovine enzyme is ~69 kDa.
In comparison, the predicted molecular weight of the polypeptide portion of core 2 β1→6
N-acetylglucosaminyltransferase is ~50 kDa. The deduced amino acid sequence of core 2 β1→6
N-acetylglucosaminyltransferase reveals two to three potential N-glycosylation sites, suggesting
N-glycosylation and O-glycosylation, or other post-translational modification, could account for the
larger apparent size of the bovine enzyme.
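
Potential N-glycosylation sites of the kind mentioned above are conventionally located by scanning a
deduced protein sequence for the Asn-X-Ser/Thr sequon, where X is any residue except proline. The
Python sketch below is a minimal illustration of that scan together with a rule-of-thumb polypeptide
mass estimate. The toy sequence, the function names and the 110 Da average residue mass are
assumptions made for illustration; the sketch does not use the sequence of SEQ. ID. NO. 4.

```python
import re

# A lookahead is used so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glycosylation_sites(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X is not Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def estimate_polypeptide_kda(n_residues: int, avg_residue_da: float = 110.0) -> float:
    """Rough polypeptide mass, assuming ~110 Da per residue (a rule of thumb)."""
    return n_residues * avg_residue_da / 1000.0


toy_protein = "MKNLSAPNPTGNNTSQW"  # hypothetical sequence, not the disclosed enzyme
print(find_n_glycosylation_sites(toy_protein))
print(estimate_polypeptide_kda(428))
```

With the rule-of-thumb residue mass, a 428 residue polypeptide comes out at roughly 47 kDa, which is
in line with the predicted ~50 kDa polypeptide; an exact value depends on the actual composition.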

Expression of the cloned C2GnT sequence, or a fragment thereof, directed formation of the specific
O-glycan core 2 oligosaccharide structure. Although several cDNA sequences encoding
glycosyltransferases have been isolated (Paulson and Colley, J. Biol. Chem. 264:17615-17618 (1989);
Schachter, Curr. Opin. Struct. Biol. 1:755-765 (1991), which are incorporated herein by reference),
C2GnT is the first reported cDNA sequence encoding an enzyme involved exclusively in O-glycan
synthesis.

In O-glycans, β1→6 N-acetylglucosaminyl linkages may occur in both core 2,
Galβ1→3(GlcNAcβ1→6)GalNAc, and core 4, GlcNAcβ1→3(GlcNAcβ1→6)GalNAc, structures (Brockhausen et al.,
Biochemistry 24:1866-1874 (1985), which is incorporated herein by reference). In addition, β1→6
N-acetylglucosaminyl linkages occur in the side chains of poly-N-acetyllactosamine, forming the
I-structure (Piller et al., J. Biol. Chem. 259:13385-13390 (1984), which is incorporated herein by
reference), and in the side chain attached to α-mannose of the N-glycan core structure, forming a
tetraantennary saccharide (Cummings et al., J. Biol. Chem. 257:13421-13427 (1982), which is
incorporated herein by reference). The enzymes responsible for these linkages all share the unique
property that Mn²⁺ is not required for their activity.
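
The four O-glycan core types differ only in which residues are attached at the 3- and 6-positions of
the initiating GalNAc, so the classification can be written as a small lookup. The Python sketch
below is an illustrative encoding of the core structures named in this description; the function
name and the string labels are assumptions chosen for readability, not an established nomenclature
API.

```python
from typing import Optional

# (residue linked beta1->3 to GalNAc, residue linked beta1->6 to GalNAc) -> core type
CORE_TABLE = {
    ("Gal", None): "core 1",        # Gal(b1-3)GalNAc
    ("Gal", "GlcNAc"): "core 2",    # Gal(b1-3)[GlcNAc(b1-6)]GalNAc
    ("GlcNAc", None): "core 3",     # GlcNAc(b1-3)GalNAc
    ("GlcNAc", "GlcNAc"): "core 4", # GlcNAc(b1-3)[GlcNAc(b1-6)]GalNAc
}


def classify_o_glycan_core(at_c3: Optional[str], at_c6: Optional[str]) -> str:
    """Classify an O-glycan core from the substituents on the GalNAc residue."""
    return CORE_TABLE.get((at_c3, at_c6), "unclassified")


def add_beta6_glcnac(at_c3: Optional[str]) -> str:
    """Core type obtained after a beta1->6 GlcNAc branch is added to GalNAc."""
    return classify_o_glycan_core(at_c3, "GlcNAc")


print(classify_o_glycan_core("Gal", None))  # core 1
print(add_beta6_glcnac("Gal"))              # core 1 acceptor converted to core 2
print(add_beta6_glcnac("GlcNAc"))           # core 3 acceptor converted to core 4
```

The lookup makes explicit that the same β1→6 linkage yields core 2 or core 4 depending only on the
acceptor, which is why acceptor specificity distinguishes the enzymes discussed here.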

Although it was originally suggested that these β1→6 N-acetylglucosaminyl linkages were formed by
the same enzyme (Piller et al., 1984), the present disclosure clearly demonstrates that the
HL-60-derived core 2 β1→6 N-acetylglucosaminyltransferase is specific for the formation only of
O-glycan core 2. This result is consistent with a recent report demonstrating that myeloid cell
lysates contain the enzymatic activity associated with core 2, but not core 4, formation (Brockhausen
et al., supra, (1991)).

Analysis of mRNA isolated from colonic cancer cells indicated core 2 β1→6
N-acetylglucosaminyltransferase is expressed in these cells. Recent studies using affinity absorption
suggested at least two different β1→6 N-acetylglucosaminyltransferases were present in tracheal
epithelium (Ropp et al., supra, (1991)). One of these transferases formed core 2, core 4, and I
structures. Thus, at least one other β1→6 N-acetylglucosaminyltransferase present in epithelial cells
can form core 2, core 4 and I structures. Similarly, a β1→6 N-acetylglucosaminyltransferase present
in Novikoff hepatoma cells can form both core 2 and I structures (Koenderman et al., Eur. J. Biochem.
166:199-200 (1987), which is incorporated herein by reference).

The acceptor molecule specificity of core 2 β1→6 N-acetylglucosaminyltransferase is different from
the specificity of the enzymes present in tracheal epithelium and Novikoff hepatoma cells. Thus, a
family of β1→6 N-acetylglucosaminyltransferases can exist, the members of which differ in acceptor
specificity but are capable of forming the same linkage. Members of this family are isolated from
cells expressing β1→6 N-acetylglucosaminyltransferase activity, using, for example, nucleic acid
hybridization assays and studies of acceptor molecule specificity. Such a family was reported for the
α1→3 fucosyltransferases (Weston et al., J. Biol. Chem. 267:4152-4160 (1992), which is incorporated
herein by reference).

The formation of the core 2 structure is critical to cell structure and function. For example, the
core 2 structure is essential for elongation of poly-N-acetyllactosamine and for formation of sialyl
Leˣ or sialyl Leᵃ structures. Furthermore, the biosynthesis of cartilage keratan sulfate may be
initiated by the core 2 β1→6 N-acetylglucosaminyltransferase, since the keratan sulfate chain is
extended from a branch present in the core 2 structure in the same way as poly-N-acetyllactosamine
(Dickenson et al., Biochem. J. 269:55-59 (1990), which is incorporated herein by reference). Keratan
sulfate is absent in wild-type CHO cells, which do not express the core 2 β1→6
N-acetylglucosaminyltransferase (Esko et al., J. Biol. Chem. 261:15725-15733 (1986), which is
incorporated herein by reference). These structures are believed to be important for cellular
recognition and matrix formation. The availability of the cDNA clone encoding the core 2 β1→6
N-acetylglucosaminyltransferase will aid in understanding how the various carbohydrate structures are
formed during differentiation and malignancy. Manipulation of the expression of the various
carbohydrate structures by gene transfer and gene inactivation methods will help elucidate the
various functions of these structures.

The present invention is directed to a method for transient expression cloning in CHO cells of cDNA
sequences encoding proteins having enzymatic activity. Isolation of human core 2 β1→6
N-acetylglucosaminyltransferase is provided as an example of the disclosed method. However, the
method can be used to obtain cDNA sequences encoding other proteins having enzymatic activity.

For example, lectins and antibodies reactive with other specific oligosaccharide structures are
available and can be used to screen for glycosyltransferase activity. Also, CHO cell lines that have
defects in glycosylation have been isolated. These cell lines can be used to study the activity of
the corresponding glycosyltransferase (Stanley, Ann. Rev. Genet. 18:525-562 (1984), which is
incorporated herein by reference). CHO cell lines also have been selected for various defects in
cellular metabolism, loss of expression of cell surface molecules and resistance to cytotoxic drugs
(see, for example, Malmstrom and Krieger, J. Biol. Chem. 266:24025-24030 (1991); Yayon et al., Cell
64:841-848 (1991), which are incorporated herein by reference). The approach disclosed herein should
allow isolation of cDNA sequences encoding the proteins involved in these various cellular functions.

As used herein, the terms "purified" and "isolated" mean that the molecule or compound is
substantially free of contaminants normally associated with a native or natural environment. For
example, a purified protein can be obtained by a number of methods. The naturally-occurring protein
can be purified by any means known in the art, including, for example, by affinity purification with
antibodies having specific reactivity with the protein. In this regard, anti-core 2 β1→6
N-acetylglucosaminyltransferase antibodies can be used to substantially purify naturally-occurring
core 2 β1→6 N-acetylglucosaminyltransferase from human HL-60 promyelocytes.

Alternatively, a purified protein of the present invention can be obtained by well known recombinant
methods, utilizing the nucleic acids disclosed herein, as described, for example, in Sambrook et al.,
Molecular Cloning: A Laboratory Manual 2d ed. (Cold Spring Harbor Laboratory 1989), which is
incorporated herein by reference, and by the methods described in the Examples below. Furthermore,
purified proteins can be synthesized by methods well known in the art.

As used herein, the phrase "substantially the sequence" includes the described nucleotide or amino
acid sequence and sequences having one or more additions, deletions or substitutions that do not
substantially affect the ability of the sequence to encode a protein having a desired functional
activity. In addition, the phrase encompasses any additional sequence that hybridizes to the
disclosed sequence under stringent hybridization conditions. Methods of hybridization are well known
to those skilled in the art. For example, sequence modifications that do not substantially alter such
activity are intended. Thus, a protein having substantially the amino acid sequence of Figure 5 (SEQ.
ID. NO. 4) refers to core 2 β1→6 N-acetylglucosaminyltransferase encoded by the cDNA described in
Example IV, as well as proteins having amino acid sequences that are modified but, nevertheless,
retain the functions of core 2 β1→6 N-acetylglucosaminyltransferase. One skilled in the art can
readily determine such retention of function following the guidance set forth, for example, in
Examples V and VI.

The present invention is further directed to active fragments of the human core 2 β1→6
N-acetylglucosaminyltransferase protein. As used herein, an active fragment refers to portions of the
protein that substantially retain the glycosyltransferase activity of the intact core 2 β1→6
N-acetylglucosaminyltransferase protein. One skilled in the art can readily identify active fragments
of proteins such as core 2 β1→6 N-acetylglucosaminyltransferase by comparing the activities of a
selected fragment with the intact protein following the guidance set forth in the Examples below.

As used herein, the term "glycosyltransferase activity" refers to the function of a
glycosyltransferase to link sugar residues together through a glycosidic bond to create critical
branches in oligosaccharides. Glycosyltransferase activity results in the specific transfer of a
monosaccharide to an appropriate acceptor molecule, such that the acceptor molecule contains
oligosaccharides having critical branches. One skilled in the art would understand the terms
"enzymatic activity" and "catalytic activity" to generally refer to a function of certain proteins,
such as the function of those proteins having glycosyltransferase activity.

As used herein, the term "acceptor molecule" refers to a molecule that is acted upon by a protein
having enzymatic activity. For example, an acceptor molecule, such as leukosialin, as identified by
the amino acid sequence of Figure 2 (SEQ. ID. NO. 2), accepts the transfer of a monosaccharide due to
glycosyltransferase activity. An acceptor molecule, such as leukosialin, may already contain one or
more sugar residues. The transfer of monosaccharides to an acceptor molecule, such as leukosialin,
results in the formation of critical branches of oligosaccharides.

As used herein, the term "critical branches" refers to oligosaccharide structures formed by specific
glycosyltransferase activity. Critical branches may be involved in various cellular functions, such
as cell-cell recognition. The oligosaccharide structure of a critical branch can be determined using
methods well known in the art, such as the method for determining the core 2 oligosaccharide
structure, as described in Examples V and VI.

Relatedly, the invention also provides nucleic acids encoding the human core 2 β1→6
N-acetylglucosaminyltransferase protein and leukosialin protein described above. The nucleic acids
can be in the form of DNA, RNA or cDNA, such as the novel C2GnT cDNA of 2105 base pairs identified in
Figure 5 (SEQ. ID. NO. 3) or the novel leukosialin cDNA identified in Figure 2 (SEQ. ID. NO. 1), for
example. Such nucleic acids can also be chemically synthesized by methods known in the art,
including, for example, the use of an automated nucleic acid synthesizer.

The nucleic acid can have substantially the nucleotide sequence of C2GnT, identified in Figure 5
(SEQ. ID. NO. 3), or leukosialin, identified in Figure 2 (SEQ. ID. NO. 1). Portions of such nucleic
acids that encode active fragments of the core 2 β1→6 N-acetylglucosaminyltransferase protein or
leukosialin protein of the present invention also are contemplated.

Nucleic acid probes capable of hybridizing to the nucleic acids of the present invention under
reasonably stringent conditions can be prepared from the cloned sequences or by synthesizing
oligonucleotides by methods known in the art. The probes can be labeled with markers according to
methods known in the art and used to detect the nucleic acids of the present invention. Such nucleic
acids can be detected by contacting the probe with a sample containing or suspected of containing the
nucleic acid under hybridizing conditions, and detecting the hybridization of the probe to the
nucleic acid.

The present invention is further directed to vectors containing the nucleic acids described above. The
term "vector" includes vectors that are capable of expressing nucleic acid sequences operably linked to
regulatory sequences capable of effecting their expression. Numerous cloning vectors are known in the art.
Thus, the selection of an appropriate cloning vector is a matter of choice. In general, useful vectors for
recombinant DNA are often plasmids, which refer to circular double-stranded DNA loops such as pcDNAI or
pcDSRα. As used herein, "plasmid" and "vector" may be used interchangeably, as the plasmid is a
common form of a vector. However, the invention is intended to include other forms of expression vectors
that serve equivalent functions.

Suitable host cells containing the vectors of the present invention are also provided. Host cells can be
transformed with a vector and used to express the desired recombinant or fusion protein. Methods of
recombinant expression in a variety of host cells, such as mammalian, yeast, insect or bacterial cells, are
widely known. For example, a nucleic acid encoding core 2 β1→6 N-acetylglucosaminyltransferase or a
nucleic acid encoding leukosialin can be transfected into cells using the calcium phosphate technique or
other transfection methods, such as those described in Sambrook et al., supra, (1989).

Alternatively, nucleic acids can be introduced into cells by infection with a retrovirus carrying the gene
or genes of interest. For example, the gene can be cloned into a plasmid containing retroviral long terminal
repeat sequences, the C2GnT DNA sequence or the leukosialin DNA sequence, and an antibiotic resistance
gene for selection. The construct can then be transfected into a suitable cell line, such as PA12, which
carries a packaging-deficient provirus and expresses the necessary components for virus production,
including synthesis of amphotropic glycoproteins. The supernatant from these cells contains infectious
virus, which can be used to infect the cells of interest.

Isolated recombinant polypeptides or proteins can be obtained by growing the described host cells
under conditions that favor transcription and translation of the transfected nucleic acid. Recombinant
proteins produced by the transfected host cells are isolated using methods set forth herein and by methods
well known to those skilled in the art.

Also provided are antibodies having specific reactivity with the core 2 β1→6 N-acetylglucosaminyltransferase
protein or leukosialin protein of the present invention. Active fragments of antibodies, for example,
Fab and Fab2 fragments, having specific reactivity with such proteins are intended to fall within the
definition of an "antibody." Antibodies exhibiting a titer of at least about 1.5 x 10^ as determined by ELISA
are useful in the present invention.

The antibodies of the invention can be produced by any method known in the art. For example,
polyclonal and monoclonal antibodies can be produced by methods described in Harlow and Lane,
Antibodies: A Laboratory Manual (Cold Spring Harbor 1988), which is incorporated herein by reference. The
proteins, particularly core 2 β1→6 N-acetylglucosaminyltransferase or leukosialin of the present invention,
can be used as immunogens to generate such antibodies. Altered antibodies, such as chimeric, humanized,
CDR-grafted or bifunctional antibodies, can also be produced by methods well known to those skilled in the
art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods
described, for example, in Sambrook et al., supra, (1989).

The antibodies can be used for detecting or purifying the core 2 β1→6
N-acetylglucosaminyltransferase protein or the leukosialin protein of the present invention. With respect to the
detection of such proteins, the antibodies can be used in in vitro or in vivo methods well known to those
skilled in the art.

Finally, kits useful for carrying out the methods of the invention are also provided. The kits can contain
a core 2 β1→6 N-acetylglucosaminyltransferase protein, antibody or nucleic acid of the present invention
and an ancillary reagent. Alternatively, the kit can contain a leukosialin protein, antibody or nucleic acid of
the present invention and an ancillary reagent. An ancillary reagent may include diagnostic agents, signal
detection systems, buffers, stabilisers, pharmaceutically acceptable carriers or other reagents and materials
conventionally included in such kits.

A cDNA sequence encoding core 2 β1→6 N-acetylglucosaminyltransferase was isolated and core 2
β1→6 N-acetylglucosaminyltransferase activity was determined. This is the first report of transient expression
cloning using CHO cells expressing polyoma large T antigen. The following examples are intended to
illustrate but not limit the present invention.

EXAMPLE I 

EXPRESSION CLONING IN COS-1 CELLS OF THE cDNA FOR THE PROTEIN CARRYING THE
HEXASACCHARIDES

COS-1 cells were transfected with a cDNA library, pcDSRα-2F1, constructed from poly(A)+ RNA of
activated T lymphocytes, which express the core 2 β1→6 N-acetylglucosaminyltransferase (Yokota et al.,
Proc. Natl. Acad. Sci. USA 83:5884-5898 (1986); Piller et al., supra, (1988), which are incorporated herein
by reference). COS-1 cells support replication of the pcDSRα constructs, which contain the SV40 replication
origin. Transfected cells were selected by panning using monoclonal antibody T305, which recognizes
sialylated branched hexasaccharides (Piller et al., supra, (1991); Saitoh et al., supra, (1991)). Methods
referred to in this example are described in greater detail in the examples that follow.

Following several rounds of transfection, one plasmid, pcDSRα-leu, directing high expression of the
T305 antigen was identified. The cloned cDNA insert was isolated and sequenced, then compared with
other reported sequences. The newly isolated cDNA sequence was nearly identical to the sequence
reported for leukosialin, except that the 5'-flanking sequences were different (Pallant et al., Proc. Natl. Acad. Sci.
USA 86:1328-1332 (1989), which is incorporated herein by reference).

Comparison of the cloned cDNA sequence with the genomic leukosialin DNA sequence revealed that the
start site of the cDNA sequence is located 250 bp upstream of the transcription start site of the previously
reported sequence (Figure 2: compare Exon 1' and Exon 1) (Shelley et al., Biochem. J. 270:569-576 (1990);
Kudo and Fukuda, J. Biol. Chem. 266:8483-8489 (1991), which are incorporated herein by reference). A
consensus splice site was identified at the exon-intron junction of the newly identified 122 bp exon 1' in
pcDSRα-leu (Breathnach and Chambon, Ann. Rev. Biochem. 50:349-383 (1981), which is incorporated
herein by reference). This splice site is followed by the exon 2 sequence.

These results indicate that the T305 antibody preferentially binds to branched hexasaccharides attached to
leukosialin. Indeed, a small amount of the hexasaccharides (approximately of the total) was detected in
O-glycans isolated from control COS-1 cells. T305 binding is similar to that of anti-M and anti-N antibodies, which
recognize both the glycan and polypeptide portions of the erythrocyte glycoprotein, glycophorin (Sadler et al.,
J. Biol. Chem. 254:2112-2119 (1979), which is incorporated herein by reference). These observations are
consistent with reports that only leukosialin strongly reacted with T305 in Western blots of leukocyte cell
extracts, even though leukocytes also express other glycoproteins, such as CD45, that must also contain
the same hexasaccharides (Piller et al., supra, (1991); Saitoh et al., supra, (1991)).

EXAMPLE II

ESTABLISHMENT OF CHO CELL LINES THAT STABLY EXPRESS POLYOMA VIRUS LARGE T ANTIGEN
AND LEUKOSIALIN

T305 preferentially binds to branched hexasaccharides attached to leukosialin. Such hexasaccharides
are not present on the erythropoietin glycoprotein produced in CHO cells, although the glycoprotein does
contain the precursor tetrasaccharide (Sasaki et al., J. Biol. Chem. 262:12059-12076 (1987), which is
incorporated herein by reference). T305 antigen also is not detectable in CHO cells transiently transfected
with pcDSRα-leu. In order to screen for the presence of a cDNA clone expressing core 2 β1→6
N-acetylglucosaminyltransferase activity, a CHO cell line expressing both leukosialin and polyoma large T
antigen was established (see, for example, Heffernan and Dennis, Nucl. Acids Res. 19:85-92 (1991), which is
incorporated herein by reference).

Vectors: A plasmid vector, pPSVE1-PyE, which contains the polyoma virus early genes under the control
of the SV40 early promoter, was constructed using a modification of the method of Muller et al., Mol. Cell.
Biol. 4:2406-2412 (1984), which is incorporated herein by reference. Plasmid pPSVE1 was prepared using
pPSG4 (American Type Culture Collection 37337) and SV40 viral DNA (Bethesda Research Laboratories)
essentially as described by Featherstone et al., Nucl. Acids Res. 12:7235-7249 (1984), which is incorporated
herein by reference. Following EcoRI and HincII digestion of plasmid pPyLT-1 (American Type
Culture Collection 41043), a DNA sequence containing the carboxy-terminal coding region of polyoma virus
large T antigen was isolated. The HincII site was converted to an EcoRI site by blunt-end ligation of
phosphorylated EcoRI linkers (Stratagene). Plasmid pPSVE1-PyE was generated by inserting the
carboxy-terminal coding sequence for large T antigen into the unique EcoRI site of plasmid pPSVE1.

Plasmid pZIPNEO-leu was constructed by introducing the EcoRI fragment of PEER-3 cDNA, which
contains the complete coding sequence for human leukosialin, into the unique EcoRI site of plasmid
pZIPNEO (Cepko et al., Cell 37:1053-1063 (1984), which is incorporated herein by reference). Plasmid
structures were confirmed by restriction mapping and by sequencing the construction sites. pZIPNEO was
kindly provided by Dr. Channing Der.

Transfection: CHO DG44 cells were grown in 100 mm tissue culture plates. When the cells were 20%
confluent, they were co-transfected with a 1:4 molar ratio of pZIPNEO-leu and pPSVE1-PyE using the
calcium phosphate technique (Graham and van der Eb, Virology 52:456-467 (1973), which is incorporated
herein by reference). Transfected cells were isolated and maintained in medium containing 400 µg/ml
G418 (active drug).

Leukosialin expression: The total pool of G418-resistant transfectants was enriched for human
leukosialin-expressing cells by a one-step panning procedure using anti-leukosialin antibodies and goat anti-rabbit IgG
coated panning dishes (Sigma) (Carlsson and Fukuda, J. Biol. Chem. 261:12779-12786 (1986), which is
incorporated herein by reference). Clonal cell lines were obtained by limiting dilution. Six clonal cell lines
expressing human leukosialin on the cell surface were identified by indirect immunofluorescence and
isolated for further studies (Williams and Fukuda, J. Cell Biol. 111:955-968 (1990), which is incorporated
herein by reference).

Polyoma virus-mediated replication: The ability of the six clonal cell lines to support polyoma virus large
T antigen-mediated replication of plasmids was assessed by determining the methylation status of
transfected plasmids containing a polyoma virus origin of replication (Muller et al., supra, (1984); Heffernan
and Dennis, supra, (1991)). Plasmid pGT/hCG contains a fused β1→4 galactosyltransferase and human
chorionic gonadotropin α-chain DNA sequence inserted in plasmid pcDNAI, which contains a polyoma virus
replication origin (Aoki et al., Proc. Natl. Acad. Sci. USA 89:4319-4323 (1992), which is incorporated herein
by reference).

Plasmid pGT/hCG was isolated from methylase-positive E. coli strain MC1061/P3 (Invitrogen), which
methylates the adenine residues in the DpnI recognition site, "GATC". The methylated DpnI recognition site
is susceptible to cleavage by DpnI. In contrast, the DpnI recognition site of plasmids replicated in
mammalian cells is not methylated and, therefore, is resistant to DpnI digestion.
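
The logic of the DpnI-based replication assay can be illustrated with a short sketch. It is not part of the disclosed method; it only restates the rule given above (methylated GATC sites are cut, unmethylated sites are not). The function names and the toy sequence are hypothetical, and the sequence is treated as linear for simplicity.

```python
DPNI_SITE = "GATC"  # recognition sequence named in the text


def dpni_sites(seq: str) -> list[int]:
    """Return 0-based start positions of every GATC site in a linear sequence."""
    seq = seq.upper()
    width = len(DPNI_SITE)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == DPNI_SITE]


def survives_dpni(seq: str, adenine_methylated: bool) -> bool:
    """True if the DNA is expected to remain uncut after DpnI digestion."""
    if not adenine_methylated:
        # Replicated in mammalian cells: sites are unmethylated, so DpnI does not cut.
        return True
    # Isolated from the methylase-positive E. coli host: every GATC site is a cut site.
    return not dpni_sites(seq)


toy_plasmid = "TTGATCAAGGCCGATCTT"  # hypothetical sequence, for illustration only
print(dpni_sites(toy_plasmid))
print(survives_dpni(toy_plasmid, adenine_methylated=True))
print(survives_dpni(toy_plasmid, adenine_methylated=False))
```

In the assay described here, input plasmid that was never replicated corresponds to the methylated case and is digested, whereas plasmid replicated in the CHO cells corresponds to the unmethylated case and remains intact.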

Methylated plasmid pGT/hCG was transfected by lipofection into each of the six selected clonal cell
lines expressing leukosialin. After 64 hr, low molecular weight plasmid DNA was isolated from the cells
using the method of Hirt, J. Mol. Biol. 26:365-369 (1967), which is incorporated herein by reference. Isolated
plasmid DNA was digested with XhoI and DpnI (Stratagene), subjected to electrophoresis in a 1% agarose
gel, and transferred to nylon membranes (Micron Separations Inc., MA).

A 0.4 kb SmaI fragment of the β1→4 galactosyltransferase DNA sequence of pGT/hCG was radiolabeled
with [32P]dCTP using the random primer method (Feinberg and Vogelstein, Anal. Biochem. 132:6-13
(1983), which is incorporated herein by reference). Hybridization was performed using methods well known
to those skilled in the art (see, for example, Sambrook et al., supra, (1989)). Following hybridization, the
membranes were washed several times, including a final high stringency wash in 0.1 x SSPE, 0.1% SDS for
1 hr at 65°C, then exposed to Kodak X-AR film at -70°C.

Four of the six clones tested supported replication of the pcDNAI-based plasmid, pGT/hCG (Fig. 3A,
lanes 1, 3, 4 and 5). MOP-8 cells, a 3T3 cell line transformed by polyoma virus early genes (Muller et al.,
supra, (1984)), express endogenous core 2 β1→6 N-acetylglucosaminyltransferase activity and were used
as a control for the replication assay (Fig. 3B, lane 1). One clonal cell line, CHO-Py-leu, which supported
pGT/hCG replication (Fig. 3A, lane 5; Fig. 3B, lanes 2 and 3) and expressed a significant amount of
leukosialin, was selected for further studies. pGT/hCG was kindly provided by Dr. Michiko Fukuda.

EXAMPLE III

ISOLATION OF A cDNA SEQUENCE DIRECTING EXPRESSION OF THE HEXASACCHARIDE ON
LEUKOSIALIN



Poly(A)+ RNA was isolated from HL-60 promyelocytes, which contain a significant amount of the core 2
β1→6 N-acetylglucosaminyltransferase (Saitoh et al., supra, (1991)). A cDNA expression library, pcDNAI-HL-60,
was prepared (Invitrogen) and the library was screened for clones directing the expression of the T305
antigen.

Plasmid DNA from the pcDNAI-HL-60 cDNA library was transfected into CHO-Py-leu cells using a
modification of the lipofection procedure, described below (Felgner et al., Proc. Natl. Acad. Sci. USA
84:7413-7417 (1987), which is incorporated herein by reference). CHO-Py-leu cells were grown in 100 mm
tissue culture plates. When the cells were 20% confluent, they were washed twice with Opti-MEM I
(GIBCO). Fifty µg of lipofectin reagent (Bethesda Research Laboratories) and 20 µg of purified plasmid
DNA were each diluted to 1.5 ml with Opti-MEM I, then mixed and added to the cells. After incubation for 6
hr at 37°C, the medium was removed, 10 ml of complete medium was added and incubation was continued
for 16 hr at 37°C. The medium was then replaced with 10 ml of fresh medium.

Following a 64 hr period to allow transient expression of the transfected plasmids, the cells were
detached in PBS/5 mM EDTA, pH 7.4, for 30 min at 37°C, pooled, centrifuged and resuspended in cold
PBS/10 mM EDTA/5% fetal calf serum, pH 7.4, containing a 1:200 dilution of ascites fluid containing T305
monoclonal antibody. The cells were incubated on ice for 1 hr, then washed in the same buffer and panned
on dishes coated with goat anti-mouse IgG (Sigma) (Wysocki and Sato, Proc. Natl. Acad. Sci. USA 75:2844-2848
(1978); Seed and Aruffo, Proc. Natl. Acad. Sci. USA 84:3365-3369 (1987), which are incorporated herein
by reference). T305 monoclonal antibody was kindly provided by Dr. R.I. Fox, Scripps Research Foundation,
La Jolla, CA.

Plasmid DNA was recovered from adherent cells by the method of Hirt, supra, (1967), treated with DpnI
to eliminate plasmids that had not replicated in transfected cells, and transformed into E. coli strain
MC1061/P3. Plasmid DNA was then recovered and subjected to a second round of screening. E. coli
transformants containing plasmids recovered from this second enrichment were plated to yield 6 pools of
approximately 500 colonies each. Replica plates were prepared using methods well known to those skilled
in the art (see, for example, Sambrook et al., supra, (1989)).



The pooled plasmid DNA was prepared from replica plates and transfected into CHO-Py-leu cells. The
transfectants were screened by panning. One plasmid pool was selected and subjected to three subsequent
rounds of selection. One plasmid, pcDNAI-C2GnT, which directed the expression of the T305 antigen, was
isolated. CHO-Py-leu cells transfected with pcDNAI-C2GnT express the antigen recognized by T305,
whereas CHO-Py-leu cells transfected with pcDNAI are negative for T305 antigen (Fig. 4). These results
show pcDNAI-C2GnT directs the expression of a new determinant on leukosialin that is recognized by T305
monoclonal antibody. This determinant is the branched hexasaccharide sequence,
NeuNAcα2→3Galβ1→3(NeuNAcα2→3Galβ1→4GlcNAcβ1→6)GalNAc.
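
The pooling strategy used in this example (split the enriched clones into pools, assay each pool, then subdivide the positive pool until one plasmid remains) can be sketched as follows. This is only an illustration under simplifying assumptions: `sib_select`, the clone labels and the stand-in assay function are hypothetical, each round keeps the first pool that scores positive, and a real screen scores pools by panning with T305 rather than by a function call.

```python
from typing import Callable, List, Sequence


def sib_select(clones: Sequence[str],
               pool_is_positive: Callable[[Sequence[str]], bool],
               n_pools: int = 6) -> str:
    """Narrow a mixed population down to a single positive clone by repeated pooling."""
    if not clones:
        raise ValueError("no clones to screen")
    if n_pools < 2:
        raise ValueError("need at least two pools per round")
    current: List[str] = list(clones)
    while len(current) > 1:
        # Split the current population into roughly equal sub-pools.
        k = min(n_pools, len(current))
        pools = [current[i::k] for i in range(k)]
        # Keep the first sub-pool that scores positive in the assay.
        positive = next((p for p in pools if pool_is_positive(p)), None)
        if positive is None:
            raise ValueError("no positive pool found")
        current = positive
    return current[0]


# Toy usage: 3000 hypothetical clones (6 pools of 500), one of which is positive.
library = [f"clone{i}" for i in range(3000)]
target = "clone1234"
found = sib_select(library, lambda pool: target in pool)
print(found)
```

Each round reduces the population by roughly the number of pools, so only a handful of transfection and panning rounds are needed to go from a few thousand colonies to a single plasmid.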

EXAMPLE IV

CHARACTERIZATION OF C2GnT

DNA sequence: The cDNA insert in plasmid pcDNAI-C2GnT was sequenced by the dideoxy chain
termination method using Sequenase version 2 reagents (United States Biochemicals) (Sanger et al., Proc.
Natl. Acad. Sci. USA 74:5463-5467 (1977), which is incorporated herein by reference). Both strands were
sequenced using 17-mer synthetic oligonucleotides, which were synthesized as the sequence of the cDNA
insert became known.

Plasmid pcDNAI-C2GnT contains a 2105 base pair insert (Fig. 5). The cDNA sequence (SEQ. ID. NO. 3)
ends 1878 bp downstream of the putative translation start site. A polyadenylation signal is present at
nucleotides 1694-1699. The significance of the large number of nucleotides between the polyadenylation
signal and the beginning of the polyadenyl chain is not clear. However, this sequence is AT rich.

Deduced amino acid sequence: The cDNA insert in plasmid pcDNAI-C2GnT encodes a single open
reading frame in the sense orientation with respect to the pcDNAI promoter (Fig. 5). The open reading
frame encodes a putative 428 amino acid protein having a molecular mass of 49,790 daltons.

Hydropathy analysis indicates the predicted protein is a type II transmembrane molecule, as are all
previously reported mammalian glycosyltransferases (Schachter, supra, (1991)). In this topology, a nine
amino acid cytoplasmic NH2-terminal segment is followed by a 23 amino acid transmembrane domain
flanked by basic amino acid residues. The large COOH-terminus consists of the stem and catalytic domains
and presumably faces the lumen of the Golgi complex.
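
As a worked illustration of the topology just described, the residue ranges implied by the stated segment lengths can be tabulated. The sketch below uses only figures given in this specification (428 residues in total, a nine-residue cytoplasmic segment and a 23-residue transmembrane domain); the `Domain` class and `type_ii_topology` function are hypothetical helpers, and the exact residue boundaries are an assumption derived from those lengths rather than from the hydropathy plot itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def type_ii_topology(total: int, cytoplasmic: int, transmembrane: int) -> List[Domain]:
    """Lay out NH2-terminal cytoplasmic tail, membrane anchor and lumenal COOH-terminal region."""
    tm_start = cytoplasmic + 1
    tm_end = cytoplasmic + transmembrane
    return [
        Domain("cytoplasmic", 1, cytoplasmic),
        Domain("transmembrane", tm_start, tm_end),
        Domain("lumenal (stem + catalytic)", tm_end + 1, total),
    ]


domains = type_ii_topology(total=428, cytoplasmic=9, transmembrane=23)
for d in domains:
    print(f"{d.name}: residues {d.start}-{d.end} ({d.length} aa)")
```

Under these assumptions the lumenal region spans residues 33 to 428, which is consistent with the putative catalytic domain (residues 38 to 428) used for the protein A fusion in Example V lying entirely on the lumenal side.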

The putative protein contains three potential N-glycosylation sites (Fig. 5, asterisks). However, one of
these sites contains a proline residue adjacent to asparagine and is not likely utilized in vivo.

No matches were obtained when the C2GnT cDNA sequence and deduced amino acid sequence were
compared with sequences listed in the PC/Gene 6.6 data bank. In particular, no homology was revealed
between the deduced amino acid sequence of C2GnT and other glycosyltransferases, including
N-acetylglucosaminyltransferase I (Sarkar et al., Proc. Natl. Acad. Sci. USA 88:234-238 (1991), which is
incorporated herein by reference).

mRNA expression: Poly(A)+ RNA was prepared using a kit (Stratagene), resolved by electrophoresis
on a 1.2% agarose/2.2 M formaldehyde gel, and transferred to nylon membranes (Micron Separations Inc.,
MA) using methods well known to those skilled in the art (see, for example, Sambrook et al., supra, (1989)).
Membranes were probed using the EcoRI insert of pPROTA-C2GnT (see below) radiolabeled with
[32P]dCTP by the random priming method (Feinberg and Vogelstein, supra, (1983)). Hybridization was performed
in buffers containing 50% formamide for 24 hr at 42°C (Sambrook et al., supra, (1989)). Following
hybridization, filters were washed several times in 1 x SSPE/0.1% SDS at room temperature and once in
0.1 x SSPE/0.1% SDS at 42°C, then exposed to Kodak X-AR film at -70°C.

Fig. 6 compares the level of core 2 β1→6 N-acetylglucosaminyltransferase mRNA isolated from HL-60
promyelocytes, K562 erythroleukemia cells, and poorly metastatic SP and highly metastatic L4 colonic
carcinoma cells. The major RNA species migrates at a size essentially identical to the ~2.1 kb C2GnT
cDNA sequence. The same result is observed for HL-60 cells and the two colonic cell lines, which
apparently synthesize the hexasaccharides. In addition, two transcripts of ~3.3 kb and 5.4 kb in size were
detected in these cell lines. The two larger transcripts may result from differential usage of polyadenylation
signals.

No hybridization occurred with poly(A)+ RNA isolated from K562 cells, which lack the hexasaccharide
but synthesize the tetrasaccharide (Carlsson et al., supra, (1986), which is incorporated herein by
reference). Similarly, no hybridization was observed for poly(A)+ RNA isolated from CHO-Py-leu cells (Fig. 6,
lane 1).

EXAMPLE V

EXPRESSION OF ENZYMATICALLY ACTIVE β1→6 N-ACETYLGLUCOSAMINYLTRANSFERASE

In order to confirm that the C2GnT cDNA encodes core 2 β1→6 N-acetylglucosaminyltransferase,
enzymatic activity was examined in CHO-Py-leu cells transfected with pcDNAI or pcDNAI-C2GnT. Following
a 64 hr period to allow transient expression, cell lysates were prepared and core 2 β1→6
N-acetylglucosaminyltransferase activity was measured.

N-acetylglucosaminyltransferase assays were performed essentially as described by Saitoh et al.,
supra, (1991), Yousefi et al., supra, (1991), and Lee et al., J. Biol. Chem. 265:20476-20487 (1990), which is
incorporated herein by reference. Each reaction contained 50 mM MES, pH 7.0, 0.5 µCi of UDP-[3H]GlcNAc
in 1 mM UDP-GlcNAc, 0.1 M GlcNAc, 10 mM Na2EDTA, 1 mM of acceptor and 25 µl of either cell lysate,
cell supernatant or IgG-Sepharose matrix in a total reaction volume of 50 µl.
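
For orientation, the reaction composition above fixes the specific activity of the donor substrate, which in turn allows measured counts to be converted into an amount of GlcNAc transferred. The following is a minimal sketch of that arithmetic and not a procedure taken from this specification. It assumes that 1 mM is the final UDP-GlcNAc concentration in the 50 µl reaction, that the mass of the radiolabel itself is negligible, and that the counting efficiency (not stated in the text) is supplied by the user; the function names are hypothetical.

```python
DPM_PER_MICROCURIE = 2.22e6  # 1 µCi = 3.7e4 disintegrations/s = 2.22e6 dpm


def specific_activity_dpm_per_nmol(label_uci: float = 0.5,
                                   donor_mM: float = 1.0,
                                   volume_ul: float = 50.0) -> float:
    """Specific activity of UDP-GlcNAc in the reaction (dpm per nmol of donor)."""
    donor_nmol = donor_mM * volume_ul  # mM x µl = nmol
    return label_uci * DPM_PER_MICROCURIE / donor_nmol


def glcnac_transferred_pmol(net_cpm: float, counting_efficiency: float) -> float:
    """Convert background-subtracted cpm into pmol of GlcNAc incorporated."""
    dpm = net_cpm / counting_efficiency
    return 1000.0 * dpm / specific_activity_dpm_per_nmol()


print(specific_activity_dpm_per_nmol())
# Hypothetical example: 500 net cpm counted at an assumed 40% efficiency.
print(glcnac_transferred_pmol(net_cpm=500, counting_efficiency=0.4))
```

Background-subtracted counts (with acceptor minus without acceptor) are the appropriate input, since the no-acceptor reaction measures radioactivity not attributable to transfer onto the acceptor.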

Reactions were incubated for 1 hr at 37°C, then processed by C18 Sep-Pak chromatography (Waters)
(Palcic et al., J. Biol. Chem. 265:6759-6769 (1990), which is incorporated herein by reference). Core 2 and
core 4 β1→6 N-acetylglucosaminyltransferase were assayed using the acceptors p-nitrophenyl
Galβ1→3GalNAc and p-nitrophenyl GlcNAcβ1→3GalNAc, respectively (Toronto Research Chemicals).

UDP-GlcNAc:α-Man β1→6 N-acetylglucosaminyltransferase (V) was assayed using the acceptor
GlcNAcβ1→2Manα1→6Glc-β-O-(CH2)7CH3. The blood group I enzyme,
UDP-GlcNAc:GlcNAcβ1→3Galβ1→4GlcNAc (GlcNAc to Gal) β1→6 N-acetylglucosaminyltransferase, was assayed
using GlcNAcβ1→3Galβ1→4GlcNAcβ1→6Manα1→6Manβ1→O-(CH2)8COOCH3 or
Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→O-(CH2)7CH3 as acceptors (Gu et al., J.
Biol. Chem. 267:2994-2999 (1992), which is incorporated herein by reference). Synthetic acceptors were
kindly provided by Dr. Ole Hindsgaul, University of Alberta, Canada.
Results of these assays are shown in Table I. Assuming transfection efficiency of the cells is
approximately 20-30%, the level of enzymatic activity directed by cells transfected with pcDNAI-C2GnT is
roughly equivalent to the level observed in HL-60 cells.

In order to unequivocally establish that the C2GnT cDNA sequence encodes core 2 β1→6
N-acetylglucosaminyltransferase, plasmid pPROTA-C2GnT was constructed containing the DNA sequence
encoding the putative catalytic domain of core 2 β1→6 N-acetylglucosaminyltransferase fused in frame with the
signal peptide and IgG binding domain of S. aureus protein A (Fig. 7). The putative catalytic domain is
contained in a 1330 bp fragment of the C2GnT cDNA that encodes amino acid residues 38 to 428. Plasmid
pPROTA was kindly provided by Dr. John B. Lowe.

The polymerase chain reaction (PCR) was used to insert EcoRI recognition sites on either side of the
1330 bp sequence in pcDNAI-C2GnT DNA. PCR was performed using the synthetic oligonucleotide primers
5'-TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC-3' (SEQ. ID. NO. 5) and
5'-TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA-3' (SEQ. ID. NO. 6) (EcoRI recognition sites: GAATTC).
The EcoRI sites allowed direct, in-frame insertion of the fragment into the unique EcoRI site of plasmid
pPROTA (Sanchez-Lopez et al., J. Biol. Chem. 263:11892-11899 (1988), which is incorporated herein by
reference).
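
The primer design can be checked with a short sketch that locates the EcoRI recognition site in each oligonucleotide. It is offered only as an illustration: the primer strings are transcribed from this text (the first is read as beginning TTTG and containing TTTGT, which is an assumption where the printed characters are unclear), and the helper names are hypothetical.

```python
ECORI_SITE = "GAATTC"

# Primer sequences as read from the text above (5' to 3').
PRIMERS = {
    "SEQ ID NO 5": "TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC",
    "SEQ ID NO 6": "TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def site_positions(seq: str, site: str = ECORI_SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


for name, primer in PRIMERS.items():
    positions = site_positions(primer)
    tail = primer[positions[0] + len(ECORI_SITE):]  # portion 3' of the added site
    print(name, "EcoRI site at", positions, "followed by", tail)
```

Because GAATTC is its own reverse complement, the site reads identically on both strands, so a fragment flanked by two such sites can ligate into a single EcoRI site in either orientation; this is why the orientation of the insert has to be confirmed by sequencing.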

The nucleotide sequence of the insert as well as the proper orientation were confirmed by DNA
sequencing using the primers described above for cDNA sequencing. Plasmid pPROTA-C2GnT allows
secretion of the fusion protein from transfected cells and binding of the secreted fusion protein by
insolubilized immunoglobulins.

Either pPROTA or pPROTA-C2GnT was transfected into COS-1 cells. Following a 64 hr period to allow
transient expression, cell supernatants were collected (Kukowska-Latallo et al., supra, (1990)). Cell
supernatants were cleared by centrifugation, adjusted to 0.05% Tween 20 and either assayed directly for core 2
β1→6 N-acetylglucosaminyltransferase activity or used in IgG-Sepharose (Pharmacia) binding studies. For
the latter assay, supernatants (10 ml) were incubated batchwise with approximately 300 µl of
IgG-Sepharose for 4 hr at 4°C. The matrices were then extensively washed and used directly for
glycosyltransferase assays.

No core 2 β1→6 N-acetylglucosaminyltransferase activity was detected in the medium of COS-1 cells
transfected with the control plasmid, pPROTA. Similarly, no enzymatic activity was associated with
IgG-Sepharose beads. In contrast, a significant level of core 2 β1→6 N-acetylglucosaminyltransferase activity
was detected in the medium of COS-1 cells transfected with pPROTA-C2GnT. The activity also associated
with the IgG-Sepharose beads (Table II). No activity was detected in the supernatant following incubation of
the supernatant with IgG-Sepharose.

TABLE II

Determination of Enzymatic Activities Directed by pPROTA-C2GnT

                                                              Radioactivity (cpm)
Acceptor and linkage formed                 Enzyme         (-) acceptor   (+) acceptor

GlcNAcβ1→6 on Galβ1→3GalNAc                 core 2-GnT         109           1048
GlcNAcβ1→6 on GlcNAcβ1→3GalNAc              core 4-GnT         111            113
GlcNAcβ1→6 on GlcNAcβ1→2Man                 GnTV               11?            115
GlcNAcβ1→6 on GlcNAcβ1→3Gal                 I-GnT              111            113
GlcNAcβ1→6 on Galβ1→4GlcNAcβ1→3Gal          I-GnT               99             96

COS-1 cells were transfected with pPROTA-C2GnT and the conditioned media were incubated with
IgG-Sepharose. The proteins bound to the IgG-Sepharose were assayed for β1→6
N-acetylglucosaminyltransferase activity by using appropriate acceptors. The linkages formed are indicated
by italics. Similar results were obtained in three independent experiments.

EXAMPLE VI

DETERMINATION OF C2GnT SPECIFICITY

Four types of β1→6 N-acetylglucosaminyltransferase linkages have been reported, including core 2 and
core 4 in O-glycans, I-antigen and a branch attached to mannose that forms tetraantennary N-glycans (see
Table II). In order to determine whether these different structures are also synthesized by the enzyme
encoded by the cloned C2GnT cDNA sequence, enzymatic activity was determined using five different acceptors.

As shown in Table II, the fusion protein was only active with the acceptor for core 2 formation. The
same was true when the formation of N-acetylglucosaminyl linkage to internal galactose residues was
examined (Table II, see structure at bottom). This result precludes the likelihood that the enzyme encoded
by the C2GnT cDNA sequence may add N-acetylglucosamine to a non-reducing terminal galactose. The
HL-60 core 2 β1→6 N-acetylglucosaminyltransferase is exclusively responsible for the formation of the
GlcNAcβ1→6 branch on Galβ1→3GalNAc.

Although the invention has been described with reference to the disclosed embodiments, it should be
understood that various modifications can be made without departing from the spirit of the invention.
Accordingly, the invention is limited only by the following claims.

Lowe et al., Cell 63:475-484 (1990)
Brandley et al., Cell 63:861-863 (1990)
Phillips et al., Science 250:1130-1132 (1990)
Walz et al., Science 250:1132-1135 (1990)
Higgins et al., J. Biol. Chem. 266:6280-6290 (1991)
Schachter, Biochem. Cell Biol. 64:163-181 (1986)

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: La Jolla Cancer Research Foundation
(B) STREET: 10901 North Torrey Pines Road
(C) CITY: La Jolla
(D) STATE: California
(E) COUNTRY: U.S.A.
(F) POSTAL CODE (ZIP): 92037

(ii) TITLE OF INVENTION: A NOVEL BETA1-6 N-ACETYLGLUCOSAMINYLTRANSFERASE, ITS ACCEPTOR MOLECULE,
LEUKOSIALIN, AND A METHOD FOR CLONING PROTEINS HAVING ENZYMATIC ACTIVITY

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 900 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 841..900

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 91..192
(D) OTHER INFORMATION: /note= "EXON 1' IS LOCATED IN BOTH GENOMIC AND cDNA. IN THE cDNA EXON 1' IS
IMMEDIATELY FOLLOWED BY EXON 2."

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 359..425
(D) OTHER INFORMATION: /note= "EXON 1 IS LOCATED IN GENOMIC DNA"

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 193..806
(D) OTHER INFORMATION: /note= "THIS SEGMENT OF NUCLEIC ACID CONSTITUTES INTRON SEQUENCE OF THE cDNA"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 807..900
(D) OTHER INFORMATION: /note= "EXON 2 IS LOCATED IN BOTH GENOMIC AND cDNA. IN THE cDNA EXON 2 IMMEDIATELY
FOLLOWS EXON 1'."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTGGGGACCA CAAATGCAAA GGAAACCACC CTCCCCTCCC ACCTCCTCCT CTGCACCCTT 60 

GAGTTCTCAG GCTCAGATTC CCACCACCCA CGTCTGAGCC CAGCCCTCCC fAGCATCACC 120 

ACTTCCATCC CATTCCTCAG CCAAGAGCCA GGAATCCTGA TTCCAGATCC CACGCTTCCC 180 

TGCCTCCCTC AGGTCACCCC CAGACCCCCA GGCACCCCGC TGGGGCCTGA AGGAGCAGOT 240 

GATGGTGCTG TCTTCCCCCA GCAGCTGTOG GAGCAGGCGG GXGGGGCAGG ATGCAGCGGT 300 

GGGTGGGGTG GGTGGAGCCA GGGCCCACTT CCTTTCCCCT TGGGGCCCTG TCCTTCCCAO 360 

TCTTCCCCCA GCCTCGGGAG GTGGTGGAGT GACCTGGCCC CAGTGCTGCG TCCTTATCAG 420 

CCCACCCGGT AAGAGGGTGA GACTTGGTGG GGTAGGGGCC TCAGTGGGCC TGGGAATGTG 480 

CCTGTGGCTT GAAAAGACTC TGACAGGTTA TGATCGGAAG AGATTGGGAG CCATTGGGCT 540 

GCACAGCGTC AGGGAAGGCC ACGAGGGGCT GGTCACTGCT GGAATCTAAG CTGCTGAGGC 600 

TGGAGCGAGC CTCAGGATGG GGCTGATGCG GGAGCTGCCA GCATCTGTTC CTCTGTCATT 660 

TCTGATAACA GTAAAAGCCA GCATGGAAAA AACCGTTAAA CCGCAGGTTG GGCCTGGCGG 720 

TTGGCAGGGA AGTGGGCAGA GGGGAGGCCC GGCCAGGTCC TCCGGCAACT CCCGCGTGTT 780 

CTGCTTCTCC GGCTGCCCAC CTGCAGGTCC CAGCTCTTGC TCCTGCCTGT TTGCCTGGAA 840 

ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC 888 
Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp 
1               5                   10                  15 

GCT CTG GGG AGC 900 
Ala Leu Gly Ser 
            20 



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp 
1               5                   10                  15 

Ala Leu Gly Ser 
            20 



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 220..1504 

(ix) FEATURE: 
(A) NAME/KEY: polyA_signal 

(B) LOCATION: 1913..1918 

(ix) FEATURE: 

(A) NAME/KEY: misc_signal 

(B) LOCATION: 248..314 

(D) OTHER INFORMATION: /standard_name= 
"SIGNAL/MEMBRANE-ANCHORING DOMAIN" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

GTGAAGTGCT CAGAATGGGG CAGGATGTCA CCTGGAATCA GCACTAAGTG ATTCAGACTT 60 

TCCTTACTTT TAAATGTGCT GCTCTTCATT TCAAGATGCC GTTGCAGCTC TGATAAATGC 120 

AAACTGACAA CCTTCAAGGC CACGAGGGAG GGAAAATCAT TGGTGCTTGG AGCATAGAAG 180 

ACTGCCCTTC ACAAAGGAAA TCCCTGATTA TTGTTTGAA ATG CTG AGG ACG TTG 234 
                                           Met Leu Arg Thr Leu 
                                           1               5 

CTG CGA AGG AGA CTT TTT TCT TAT CCC ACC AAA TAC TAC TTT ATG GTT 282 
Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys Tyr Tyr Phe Met Val 
                10                  15                  20 

CTT GTT TTA TCC CTA ATC ACC TTC TCC GTT TTA AGG ATT CAT CAA AAG 330 
Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu Arg Ile His Gln Lys 
            25                  30                  35 

CCT GAA TTT GTA AGT GTC AGA CAC TTG GAG CTT GCT GGG GAG AAT CCT 378 
Pro Glu Phe Val Ser Val Arg His Leu Glu Leu Ala Gly Glu Asn Pro 
        40                  45                  50 

AGT AGT GAT ATT AAT TGC ACC AAA GTT TTA CAG GGT GAT GTA AAT GAA 426 
Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln Gly Asp Val Asn Glu 
    55                  60                  65 

ATC CAA AAG GTA AAG CTT GAG ATC CTA ACA GTG AAA TTT AAA AAG CGC 474 
Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val Lys Phe Lys Lys Arg 
70                  75                  80                  85 

CCT CGG TGG ACA CCT GAC GAC TAT ATA AAC ATG ACC AGT GAC TGT TCT 522 
Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met Thr Ser Asp Cys Ser 
                90                  95                  100 

TCT TTC ATC AAG AGA CGC AAA TAT ATT GTA GAA CCC CTT AGT AAA GAA 570 
Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu Pro Leu Ser Lys Glu 
            105                 110                 115 

GAG GCC GAG TTT CCA ATA GCA TAT TCT ATA GTG GTT CAT CAC AAG ATT 618 
Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val Val His His Lys Ile 
        120                 125                 130 

GAA ATG CTT GAC AGG CTG CTG AGG GCC ATC TAT ATG CCT CAG AAT TTC 666 
Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr Met Pro Gln Asn Phe 
    135                 140                 145 

TAT TGC GTT CAT GTG GAC ACA AAA TCC GAG GAT TCC TAT TTA GCT GCA 714 
Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp Ser Tyr Leu Ala Ala 
150                 155                 160                 165 

GTG ATG GGC ATC GCT TCC TGT TTT AGT AAT GTC TTT GTG GCC AGC CGA 762 
Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val Phe Val Ala Ser Arg 
                170                 175                 180 

TTG GAG AGT GTG GTT TAT GCA TCG TGG AGC CGG GTT CAG GCT GAC CTC 810 
Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg Val Gln Ala Asp Leu 
            185                 190                 195 

AAC TGC ATG AAG GAT CTC TAT GCA ATG AGT GCA AAC TGG AAG TAC TTG 858 
Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala Asn Trp Lys Tyr Leu 
        200                 205                 210 

ATA AAT CTT TGT GGT ATG GAT TTT CCC ATT AAA ACC AAC CTA GAA ATT 906 
Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys Thr Asn Leu Glu Ile 
    215                 220                 225 

GTC AGG AAG CTC AAG TTG TTA ATG GGA GAA AAC AAC CTG GAA ACG GAG 954 
Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn Asn Leu Glu Thr Glu 
230                 235                 240                 245 

AGG ATG CCA TCC CAT AAA GAA GAA AGG TGG AAG AAG CGG TAT GAG GTC 1002 
Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys Lys Arg Tyr Glu Val 
                250                 255                 260 

GTT AAT GGA AAG CTG ACA AAC ACA GGG ACT GTC AAA ATG CTT CCT CCA 1050 
Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val Lys Met Leu Pro Pro 
            265                 270                 275 

CTC GAA ACA CCT CTC TTT TCT GGC AGT GCC TAC TTC GTG GTC AGT AGG 1098 
Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr Phe Val Val Ser Arg 
        280                 285                 290 

GAG TAT GTG GGG TAT GTA CTA CAG AAT GAA AAA ATC CAA AAG TTG ATG 1146 
Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys Ile Gln Lys Leu Met 
    295                 300                 305 

GAG TGG GCA CAA GAC ACA TAC AGC CCT GAT GAG TAT CTC TGG GCC ACC 1194 
Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu Tyr Leu Trp Ala Thr 
310                 315                 320                 325 

ATC CAA AGG ATT CCT GAA GTC CCG GGC TCA CTC CCT GCC AGC CAT AAG 1242 
Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu Pro Ala Ser His Lys 
                330                 335                 340 

TAT GAT CTA TCT GAC ATG CAA GCA GTT GCC AGG TTT GTC AAG TGG CAG 1290 
Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg Phe Val Lys Trp Gln 
            345                 350                 355 

TAC TTT GAG GGT GAT GTT TCC AAG GGT GCT CCC TAC CCC rrr -Prr 1338 
Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro Tyr Pro Pro Cys Asp 
        360                 365                 370 

vl? St^ S^? !^ S^? A" TTC GGA GCT GGT GAC TTG AAC 1386 
Gly Val His Val Arg Ser Val Cys Ile Phe Gly Ala Gly Asp Leu Asn 
    375                 380                 385 

TGG ATG CTG CGC AAA CAC CAC TTG TTT GCC AAT AAG TTT iSAfi fTr rA + 1434 
Trp Met Leu Arg Lys His His Leu Phe Ala Asn Lys Phe Asp Val Asp 
390                 395                 400                 405 

GTT GAC CTC TTT GCC ATC CAG TGT TTG GAT GAG CAT TTG AGA CAC AAA 1482 
Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu His Leu Arg His Lys 
                410                 415                 420 

GCT TTG GAG ACA TTA AAA CAC T GACCATTACG GGCAATTTTA TGAACAAGAA 1534 
Ala Leu Glu Thr Leu Lys His 
            425 

GAAGGATACA CAAAACGTAC CTTATCTGTT TCCCCTTCCT TGTCAGCGTC GGGAAGATGG 1594 
TATGAAGTCC TCTTTGGGGC AGGGACTCTA GTAGATCTTC TTGTCAGAGA AGCTGCATGG 1654 
TTTCTGCAGA GCACAGTTAG CTAGAAAGGT OATACCATTA AATGTTCATC TAGAGTTAAT 1714 
AGTGGGAGGA GTAAAGGTAG CCTTGAGGCC ACAGCAGGTA GCAAGGCATT GTGGAAAGAG 1774 
GGGACCACGC TGGCTCGGGA AGACGCCGAT GCATAAAGTC ACCCTCTTCC AACTGCTCAG 1834 
GGACTTAGCA AAATGAGAAG ATGTGACCTG TGCCAAAACT ATTTTGAGAA TTTTAAATGT 1894 
GACCATTTTT CTGCTATGAA TAAACTTACA GCAACAAATA ATCAAAGATA CAATTAATCT 1954 
CATATTATAT TTGTTGAAAT AGAAATTTGA TTGTACTATA AATGATTTTT GTAAATAATT 2014 
TATATTCTGC TCTAATACTG TACTGTGTAC TGTGTCTCCG TATGTCATCT CAGGGAGCTT 2074 
AAAATGGGCT TGATTTAACA TTGAAAAAAA A 2105 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Leu Arg Thr Leu Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys 
1               5                   10                  15 

Tyr Tyr Phe Met Val Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu 
            20                  25                  30 

Arg Ile His Gln Lys Pro Glu Phe Val Ser Val Arg His Leu Glu Leu 
        35                  40                  45 

Ala Gly Glu Asn Pro Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln 
    50                  55                  60 

Gly Asp Val Asn Glu Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val 
65                  70                  75                  80 

Lys Phe Lys Lys Arg Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met 
                85                  90                  95 

Thr Ser Asp Cys Ser Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu 
            100                 105                 110 

Pro Leu Ser Lys Glu Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val 
        115                 120                 125 

Val His His Lys Ile Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr 
    130                 135                 140 

Met Pro Gln Asn Phe Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp 
145                 150                 155                 160 

Ser Tyr Leu Ala Ala Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val 
                165                 170                 175 

Phe Val Ala Ser Arg Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg 
            180                 185                 190 

Val Gln Ala Asp Leu Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala 
        195                 200                 205 

Asn Trp Lys Tyr Leu Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys 
    210                 215                 220 

Thr Asn Leu Glu Ile Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn 
225                 230                 235                 240 

Asn Leu Glu Thr Glu Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys 
                245                 250                 255 

Lys Arg Tyr Glu Val Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val 
            260                 265                 270 

Lys Met Leu Pro Pro Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr 
        275                 280                 285 

Phe Val Val Ser Arg Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys 
    290                 295                 300 

Ile Gln Lys Leu Met Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu 
305                 310                 315                 320 

Tyr Leu Trp Ala Thr Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu 
                325                 330                 335 

Pro Ala Ser His Lys Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg 
            340                 345                 350 

Phe Val Lys Trp Gln Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro 
        355                 360                 365 

Tyr Pro Pro Cys Asp Gly Val His Val Arg Ser Val Cys Ile Phe Gly 
    370                 375                 380 

Ala Gly Asp Leu Asn Trp Met Leu Arg Lys His His Leu Phe Ala Asn 
385                 390                 395                 400 

Lys Phe Asp Val Asp Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu 
                405                 410                 415 

His Leu Arg His Lys Ala Leu Glu Thr Leu Lys His 
            420                 425 



(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
TTTGAATTCC CCTGAATTTC TAACTGTCAG ACAC 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..15 

(D) OTHER INFORMATION: /note= "PROTEIN A - C2GNT FUSION" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

GGG AAT TCC CCT GAA 
Gly Asn Ser Pro Glu 
1               5 



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Gly Asn Ser Pro Glu 
1 5 
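
The feature locations in the listing can be cross-checked with simple arithmetic. A location is inclusive at both ends, so the length of a CDS in bases divided by three should equal the stated number of amino acids, and adjacent exon and intron features should abut without gaps. The following Python sketch is only an illustrative check and not part of the disclosed method. The helper names are hypothetical, and it assumes the readings CDS 841..900 and intron 193..606 for SEQ ID NO:1, which are inferred from the position numbering and the adjoining exon boundaries.

```python
# Illustrative consistency check on the sequence-listing annotations.
# Locations are 1-based and inclusive, as in the listing.
# Helper names are hypothetical; the numeric values are readings of the
# printed listing (841..900 and 193..606 are assumed readings).


def span_length(start: int, end: int) -> int:
    """Number of bases covered by an inclusive location start..end."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in a CDS location; error if not a multiple of 3."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def are_contiguous(features) -> bool:
    """True if each feature starts one base after the previous one ends."""
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(features, features[1:]))


# SEQ ID NO:1 CDS versus SEQ ID NO:2 (stated length: 20 amino acids).
seq1_codons = codon_count(841, 900)
# SEQ ID NO:7 CDS versus SEQ ID NO:8 (stated length: 5 amino acids).
seq7_codons = codon_count(1, 15)
# Exon 1', intron and exon 2 of SEQ ID NO:1.
genomic_ok = are_contiguous([(91, 192), (193, 606), (607, 900)])

print(seq1_codons, seq7_codons, genomic_ok)  # prints: 20 5 True
```

Under these readings the CDS of SEQ ID NO:1 accounts for the 20 residues of SEQ ID NO:2, the CDS of SEQ ID NO:7 accounts for the 5 residues of SEQ ID NO:8, and exon 1', the intron and exon 2 tile positions 91 to 900 without overlap.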



Claims 

1. A purified human protein or an active fragment thereof having β1→6 N-acetylglucosaminyltransferase 
activity. 

2. The purified protein of claim 1, wherein said activity is that of UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to 
GalNAc) β1→6 N-acetylglucosaminyltransferase. 



3. The purified protein of claim 2, wherein said protein has a relative molecular weight of about 50 kD. 

4. An isolated nucleic acid encoding the human protein or active fragment thereof of claim 1. 

5. A vector containing the nucleic acid of claim 4. 

6. The vector of claim 5, wherein said vector is a plasmid. 

7. The vector of claim 5, wherein said vector is pcDNAI-C2GnT. 

8. A host cell containing the vector of claim 5. 

9. A purified human protein or a fragment thereof that is an acceptor molecule, said acceptor molecule 
being acted upon by the protein of claim 2 having activity which exclusively forms core 2 oligosaccharide 
structures in O-glycans. 

10. The acceptor molecule of claim 9, wherein said acceptor molecule is leukosialin, CD43. 

11. An isolated nucleic acid encoding the acceptor molecule of claim 9. 

12. A vector containing the nucleic acid of claim 11. 

13. The vector of claim 12, wherein said vector is a plasmid. 

14. The vector of claim 12, wherein said vector is pcOSFkiHeu. 

15. A host cell containing the vector of claim 12. 

16. A method of obtaining from a cell line, which does not normally contain a protein having catalytic 
activity or an acceptor molecule for said protein, a nucleic acid encoding said protein having catalytic 
activity comprising: 

a. transfecting said cell line with a DNA sequence encoding the acceptor molecule, wherein the 
acceptor molecule is stably expressed in the cell line; 

b. transfecting said cell line with a cDNA library containing said nucleic acid in a vector, wherein 
proteins encoded by the transfected cDNA are transiently expressed; 

c. screening the transfected cells for expression of said protein having catalytic activity; and 

d. isolating the nucleic acid encoding the protein having catalytic activity. 

17. The vector of claim 16, wherein said vector replicates in the transfected cell line. 

18. The vector in claim 17, wherein said vector is a plasmid. 

19. The vector of claim 16, wherein said vector contains a viral replication origin. 

20. The vector of claim 19, wherein said replication origin is the polyoma virus replication origin. 

21. The cell line of claim 16, wherein said cell line supports replication of a vector. 

22. The cell line of claim 16, wherein said cell line expresses polyoma virus large T antigen. 

23. The cell line of claim 16, wherein said cell line is the Chinese hamster ovary cell line. 

24. The cell line of claim 23, wherein said cell line is CHO-Py-leu. 

25. A method of isolating a polypeptide having catalytic activity that forms core 2 oligosaccharide 
structures in O-glycans, said method comprising growing the host cell of claim 8 under conditions 
which favor expression of a nucleic acid encoding said polypeptide, and isolating said polypeptide so 
produced. 
